Regional Variation of Domain-Specific Lexical Items: Toward a Pan-Chinese Lexical Resource

Authors

  • Oi Yee Kwong
  • Benjamin Ka-Yin T'sou

Abstract

This paper reports on an initial and necessary step toward the construction of a Pan-Chinese lexical resource. We investigated the regional variation of lexical items in two specific domains, finance and sports, and explored how much of this variation is covered by existing Chinese synonym dictionaries, in particular the Tongyici Cilin. The domain-specific lexical items were obtained from subsections of a synchronous Chinese corpus, LIVAC. Results showed that 20-40% of the words from the various subcorpora are unique to individual communities, and that as many as 70% of these unique items are not yet covered in the Tongyici Cilin. These results suggest great potential for building a Pan-Chinese lexical resource for Chinese language processing. Our next step is to explore automatic means of extracting related lexical items from the corpus and to incorporate them into existing semantic classifications.
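
The two figures reported in the abstract (the share of words unique to one community, and the share of those unique words missing from a thesaurus) can be illustrated with simple set arithmetic. The sketch below is only an illustration under assumptions: the function names, the region labels, and the placeholder tokens `w1`..`w8` are hypothetical, and the abstract does not state how the authors actually computed their figures or how LIVAC and the Tongyici Cilin were accessed.

```python
from typing import Dict, Set


def unique_items(vocab_by_region: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each region, return the words that occur in no other region."""
    unique = {}
    for region, words in vocab_by_region.items():
        # Union of every other region's vocabulary (empty if there is only one region).
        others = set().union(
            *(w for r, w in vocab_by_region.items() if r != region)
        )
        unique[region] = words - others
    return unique


def uncovered_ratio(words: Set[str], thesaurus: Set[str]) -> float:
    """Fraction of `words` that do not appear in `thesaurus` (0.0 for an empty set)."""
    if not words:
        return 0.0
    return len(words - thesaurus) / len(words)


# Hypothetical toy data: placeholder tokens, not real corpus entries.
vocab_by_region = {
    "region_a": {"w1", "w2", "w3", "w4", "w5"},
    "region_b": {"w1", "w2", "w3", "w6"},
    "region_c": {"w1", "w2", "w7", "w8"},
}
thesaurus = {"w1", "w2", "w3", "w4", "w6"}  # stand-in for a synonym dictionary

unique = unique_items(vocab_by_region)
for region in sorted(unique):
    share_unique = len(unique[region]) / len(vocab_by_region[region])
    share_missing = uncovered_ratio(unique[region], thesaurus)
    print(f"{region}: {share_unique:.0%} unique, {share_missing:.0%} of those uncovered")
```

Working with word types as sets keeps the comparison independent of corpus size; a frequency-weighted variant would need token counts, which the abstract does not describe.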

Related resources

Toward a Pan-Chinese Thesaurus

In this paper, we propose a corpus-based approach to the construction of a Pan-Chinese lexical resource, starting out with the aim to enrich existing Chinese thesauri in the Pan-Chinese context. The resulting thesaurus is thus expected to contain not only the core senses and usages of Chinese lexical items but also usages specific to individual Chinese speech communities. We introduce the ratio...

Extending a Thesaurus with Words from Pan-Chinese Sources

In this paper, we work on extending a Chinese thesaurus with words distinctly used in various Chinese communities. The acquisition and classification of such region-specific lexical items is an important step toward the larger goal of constructing a Pan-Chinese lexical resource. In particular, we extend a previous study in three respects: (1) to improve automatic classification by removing dupl...

Extending a Thesaurus in the Pan-Chinese Context

In this paper, we address a unique problem in Chinese language processing and report on our study on extending a Chinese thesaurus with region-specific words, mostly from the financial domain, from various Chinese speech communities. With the larger goal of automatically constructing a Pan-Chinese lexical resource, this work aims at taking an existing semantic classificatory structure as levera...

Impact of Density and Distribution of Unfamiliar Lexical Items on Iranian EFL Learners’ Successful Reading Comprehension Achievement

Density and distribution of Unfamiliar Lexical Items (ULIs) appear to influence learners’ Reading Comprehension Achievement (RCA). This study concerns the impact of these two variables on Iranian EFL learners’ RCA. To this end, two groups of students were timetabled for experiments designed to assess learners’ RCA. To determine the participants’ levels of proficiency a Quick Proficiency Test was fi...

Publication date: 2006